ABSTRACT

The projected gravitational potential of galaxy clusters is reflected in both their X-ray
emission and their imprint on the images of background sources due to their gravitational
lensing effects. Since these projections of the potential are weighted differently
along the line-of-sight, we propose a method to combine them and remove the degeneracy
between two cases: (i) a cluster consisting of a single potential well, or (ii) an
apparent cluster composed of several potential wells projected onto each other. We
demonstrate with simulated data of potential models that this method indeed allows
multiple clusters to be distinguished from single clusters with high significance. The
confidence limit for this distinction depends on the mass ratio between the clusters.
It ranges from $15\sigma$ for mass ratio 1:1 to $\sim 4\sigma$ for mass ratio 1:6.
Furthermore, the method reconstructs the correct cluster mass, the correct mass ratio
of the two clusters, and the correct scale radii with typical fractional accuracies of
a few percent at $3\sigma$ confidence. As an aside, our method allows gas fractions in
clusters to be determined accurately, also with $3\sigma$ fractional accuracies of order
a few percent. We argue that our method provides an alternative to the commonly used
$\beta$-fit technique, and yields more reliable results in a broader range of cases.

Key words: galaxies: clusters: general — cosmology: gravitational lensing — X-ray: 
General — methods: statistical 


1 INTRODUCTION 

What physical objects do we call galaxy clusters? Are they
typical large regions of extreme density enhancement? Or do
they, as a class, constitute a sample of peaks in the apparent,
line-of-sight projected density of galaxies, X-ray emission, or
dark matter? If so, what do cluster samples defined by different
criteria have in common? Are the estimates of cluster
abundance, and the inferences on cluster properties, misleading
because of projection effects? What are our chances
to identify and quantify projection effects?

The paramount importance of galaxy clusters as probes 
for the cosmological evolution of density perturbations and 
structure formation makes the answers to these questions 
essential for any attempts at interpreting cluster samples. 

Despite their undoubted merits, cluster samples compiled
by subjective classification of two-dimensional galaxy-count
enhancements (Abell 1958; Zwicky et al. 1968; Abell,
Corwin, & Olowin 1989) are most susceptible to being
statistically unfair. Identifying clusters by counting galaxies in
an automated process (e.g. Dalton et al. 1994) marks a major
improvement, but the so-defined samples are still subject
to projection effects.

The prevalence of projection effects is reduced in cluster
samples selected by X-ray surface brightness (e.g. Gioia
et al. 1990; Ebeling et al. 1996). Arising mostly from thermal
bremsstrahlung, the X-ray emissivity is proportional
to the squared electron density in the intracluster plasma.
It is therefore a much more reliable measure of the
three-dimensional rather than the projected density. Despite this
welcome feature, there is still ample room for selection effects
to be important in some of the analyses based on such
samples.

Recently, van Haarlem, Frenk, & White (1997) demonstrated
with simulations that projection effects are important
even for cluster samples selected by X-ray emission.
The line-of-sight integrated X-ray emission of these clusters
is usually fitted with the three-parameter $\beta$ model (Cavaliere
& Fusco-Femiano 1976). Unfortunately, conclusions
from such fits suffer from projection effects and noise, and
usually cease to provide an adequate functional description
of the dark-matter density profile on intermediate scales of
projected radii.

Our ability to recover the line-of-sight (l.o.s.) density
structure is crucial for attempts at constraining cosmological
parameters from cluster samples. It is also vitally important
for any assessment of the physical properties of clusters, e.g.
the degree of virialization and of hydrostatic equilibrium of
the intracluster gas. For example, when rich clusters are selected
for their strong gravitational lensing effects (i.e. their
ability to form large arcs), and then analyzed with respect
to their X-ray data to derive limits on the justification of
the assumption of hydrostatic equilibrium (Miralda-Escude
& Babul 1995; Loeb & Mao 1994), it may well be that selection
effects play an important role, and that some of the
conclusions can be relaxed by taking projection effects into
account (Bartelmann & Steinmetz 1996).

Another important application concerns the use of the
Sunyaev-Zel’dovich effect (e.g. Sunyaev & Zel’dovich 1980;
Rephaeli 1995). The effect can be used in two alternative
ways. First, we can assume the line-of-sight extent of the
cluster gaseous component (e.g. by relating it to the angular
size using a Hubble constant). Then, by examining the
distortion of the CMB spectrum, we can get limits for the
gas content of the cluster and its temperature. On the other
hand, we can assume the latter two (or estimate them differently)
and deduce the Hubble constant by the comparison
of the angular and line-of-sight extent. In either case, the
result depends strongly on the true l.o.s. gas profile. If the
latter is not well known, neither the Hubble constant nor the
gas content can reliably be determined (see, e.g., Roettiger
et al. 1997; Holzapfel et al. 1997).

Turning to clusters as tracers of the large-scale structure,
we are facing the same problem again. Knowledge of
the l.o.s. cluster profile is important for attempts at deriving
the cluster abundance (White, Efstathiou, & Frenk 1993;
Eke, Cole, & Frenk 1996; Viana & Liddle 1996), the cluster
mass function (e.g. Bahcall & Cen 1993; Burns et al.
1996), the spatial distribution of clusters (i.e. correlations,
probability distributions etc., see Bahcall 1988 for a review),
and the cluster velocity dispersion (e.g. Fadda et al. 1996;
Mazure et al. 1996).

In this paper, we propose an alternative to the traditional
$\beta$-fit analysis and argue that, at least to some extent,
degeneracies due to projection effects can be broken. The
proposed alternative rests on a simple idea. In hydrostatic
equilibrium, it must be possible to describe with the same
gravitational potential all observable X-ray and lensing data
pertaining to a given cluster. The specific relation between
the potential and the X-ray emission depends on the equation
of state and the temperature structure of the gas. It
is sufficient to fix these two (and the connection between
the dark matter and spatial gas distribution) in order for
the method to work. Other assumptions do not affect the
principle of our approach.

The available observational data are (i) the line-of-sight
integrated X-ray flux, (ii) the emission-weighted gas temperature,
and (iii) the gravitational lensing effects of the cluster
that give rise to, e.g., coherent distortions of the images of
background sources. It is especially the combination of X-ray
and lensing measurements that promises to break the degeneracies
arising from projection effects. In particular, the X-ray
flux is most sensitive to the gas fraction and the physical
extent of the system along the line-of-sight. The emission-weighted
temperature is most sensitive to the depth of the
three-dimensional potential well, and the shear field is most
sensitive to the integrated gravitational potential (with the
nice feature of being indifferent to the gas content).

For the sake of demonstration, we investigate two
classes of three-dimensional potentials: (i) a single, isolated
cluster, and (ii) two well separated clusters projected onto
each other along the line-of-sight. In both cases, we describe
the intracluster plasma as an isothermal ideal gas with spatially
constant mean molecular weight. Fixing the functional
form of the gravitational potential, we simulate “observed”
X-ray flux maps, X-ray spectra, and lensing distortion maps.
Given these synthetic observations, we search the parameter
space of this functional form. We minimize an appropriate
$\chi^2$ function which contains contributions from all three
types of data. As we shall show, the best-fit model parameters
reproduce the input potential very reliably. Moreover,
the degeneracy between the one- and two-cluster solutions is
removed. Attempts to fit a functional form different from the
one simulated result in a very poor goodness-of-fit relative
to the goodness-of-fit for the correct functional form.

We start (§2) by specifying the explicit and implicit assumptions
we make and provide a concise description of the
observations. We proceed by relating the observables to the
underlying gravitational potential. In the same section,
we present the functional form of the potential we choose.
The combination of all observables, and the models to describe
them, allows us to write down a $\chi^2$ statistic (§3),
which we minimize in order to find the most probable solution
for a set of data in the framework of a specific model.
In §4, we simulate observations of clusters with a specific
density profile by mimicking real observations along with
their errors, and demonstrate the ability to recover the correct
(input) density profile from the projected quantities.
We then try to fit a wrong model to the simulated system
and show how we fail in doing so. We then present
the difficulties of the $\beta$-fit model in recovering the right
cluster parameters. Finally, we discuss future implications of this
method and present our conclusions. We consider this paper
as a simple, but necessary, first step towards lifting the
line-of-sight degeneracy in galaxy clusters. We present the basic
ideas here, postponing more detailed studies to later work.


2 MODEL POTENTIAL AND OBSERVABLES 

We employ spherically symmetric cluster models and assume
that the X-ray emitting intracluster gas is in hydrostatic
equilibrium with the cluster gravitational potential. We assume
that the gas is isothermal, and the mean molecular
weight is constant throughout the cluster. These assumptions
determine the density profile of the gas. In order to
normalize the gas density, we fix the ratio between gas mass
and dark mass within a given radius.

Then, specifying the three-dimensional cluster potential
is sufficient to describe both the X-ray emission and the
lensing properties of the cluster. Unfortunately, we do not
know this three-dimensional potential a priori. Rather, we
must elect a functional form for it, based on some additional
information, which can for instance be taken from numerical
simulations.

There is growing evidence that the averaged radial
structure of numerically simulated dark halos can well be
described by a universal, two-parameter family of density
profiles $\rho(r)$,

$$\rho(r) = \frac{\rho_s}{x\,(1+x)^2} , \tag{1}$$

where $x$ is the radius in units of a scale radius $r_s$. This shape
of the density profile results independently of the parameters
of the background cosmological model, and for halos with a
broad range of masses (Navarro, Frenk, & White 1996; Cole
& Lacey 1996; Huss, Jain, & Steinmetz 1997). On the observational
side, Carlberg et al. (1997) have shown that the velocity
dispersion profiles of observed clusters are compatible
with density profiles of the form (1). If the X-ray emitting
gas is isothermal and in hydrostatic equilibrium with the
dark-matter profile (1), its flux profile has a flat core despite
the cusp in the density profile. Density profiles with small
core radii or central singularities better fit the observations
of giant arcs. The latter require cluster density profiles with
much smaller cores than inferred from X-ray observations
(see e.g. Narayan & Bartelmann 1997 for a review).

If the gravitational potential of the density distribution
(1) is normalized such that $\Phi \to 0$ for $x \to \infty$, it can be
written

$$\Phi(r) = -4\pi G \rho_s r_s^2\,\frac{\ln(1+x)}{x} . \tag{2}$$

We replace the parameter $\rho_s$ by the virial mass, by which
we mean the mass contained within the radius $r_{200}$ which
encloses an average overdensity of $\delta_c = 200$.† Since the mass
within radius $r$ is

$$M(r) = 4\pi\rho_s r_s^3 \left[\ln(1+x) - \frac{x}{1+x}\right] , \tag{3}$$

$r_{200}$ is determined by

$$\frac{3\rho_s}{c^3}\left[\ln(1+c) - \frac{c}{1+c}\right] = 200\,\bar\rho , \tag{4}$$

where $\bar\rho$ is the mean cosmic density, and $c = r_{200}/r_s$ is a
concentration parameter.
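
The change of variables from $\rho_s$ to the virial mass can be sketched numerically. The following Python fragment is only an illustration of eqs. (3) and (4): the function names are hypothetical, and the critical density, $2.775\times10^{11}\,h^2\,M_\odot\,{\rm Mpc}^{-3}$, is used as an assumed stand-in for the mean cosmic density $\bar\rho$, so that masses are in $h^{-1}M_\odot$ and lengths in $h^{-1}$ Mpc.

```python
import math

# Assumption for this sketch: the mean cosmic density equals the critical
# density, 2.775e11 h^2 M_sun / Mpc^3.
RHO_MEAN = 2.775e11


def nfw_from_virial(m_vir, r_s, rho_mean=RHO_MEAN, delta_c=200.0):
    """Return (r_200, c, rho_s) for a virial mass m_vir and scale radius r_s."""
    # Definition of r_200: mean enclosed density equals delta_c * rho_mean.
    r_200 = (3.0 * m_vir / (4.0 * math.pi * delta_c * rho_mean)) ** (1.0 / 3.0)
    c = r_200 / r_s
    # Eq. (4) solved for rho_s.
    rho_s = delta_c * rho_mean * c**3 / (3.0 * (math.log(1.0 + c) - c / (1.0 + c)))
    return r_200, c, rho_s


def nfw_mass(r, r_s, rho_s):
    """Mass within radius r, eq. (3)."""
    x = r / r_s
    return 4.0 * math.pi * rho_s * r_s**3 * (math.log(1.0 + x) - x / (1.0 + x))
```

By construction, `nfw_mass(r_200, r_s, rho_s)` returns the input virial mass, which is a convenient consistency check when the two parameterizations are mixed.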

An alternative source of information for the functional
form of the potential could be derived from the observed
projected X-ray flux profile. This leads to the famous derivation
of the $\beta$-fit. We compare the adequacy of the two
different functional forms in a later section.

† This choice of $\delta_c$ can be viewed as merely a change of variables.
The actual value for the overdensity within the virialized region
may change as a function of the background cosmology, and is of
no particular importance here.


2.1 X-ray Emission

An isothermal gas in hydrostatic equilibrium with a potential
$\Phi$ has a gas density of

$$\rho_{\rm gas}(r) = \rho_{\rm gas,0}\,\exp\left[-\left(\phi(r) - \phi_0\right)\right] , \tag{5}$$

where the index ‘0’ refers to some fiducial radius $r_0$. $\phi$ is the
potential in units of the square of a fiducial velocity $v_{\rm th}$ of
the gas particles,

$$\phi(r) \equiv \frac{\Phi(r)}{v_{\rm th}^2} = \frac{\bar m}{kT}\,\Phi(r) , \tag{6}$$

$\bar m$ being the mean mass per particle. For a mixture of 75%
hydrogen and 25% helium (by mass), which we henceforth
adopt, $\bar m \approx 10^{-24}$ g. For $r_0$, we choose the virial radius,
$r_{200}$. We further adapt $\rho_{\rm gas,0}$ such that the total gas mass
within the virial radius is a fraction $f_{\rm gas}$ (hereafter called
“gas fraction”) of the total mass,


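$$4\pi \int_0^{r_{200}} r^2\,{\rm d}r\,\rho_{\rm gas}(r) = f_{\rm gas}\,M(r_{200}) = f_{\rm gas}\,\frac{4\pi}{3}\,r_{200}^3\,200\,\bar\rho , \tag{7}$$

where the last equality follows from the definition of $r_{200}$.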
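
The normalization of the gas density can be illustrated with a short numerical sketch. It assumes the potential of eq. (2), so that $\phi(x) = -a\,\ln(1+x)/x$ with the dimensionless depth $a = 4\pi G\rho_s r_s^2\,\bar m/(kT)$; the parameter `a`, the function names, and the use of Simpson's rule are choices made for this example only and are not taken from the text.

```python
import math


def gas_shape(x, c, a):
    """exp[-(phi(x) - phi_0)] with the fiducial radius at x = c (the virial radius)."""
    def u(t):
        # ln(1+t)/t tends to 1 for t -> 0, which gives the flat gas core.
        return math.log1p(t) / t if t > 0.0 else 1.0
    return math.exp(a * (u(x) - u(c)))


def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule with an even number n of intervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * f(lo + k * h)
    return total * h / 3.0


def rho_gas0(m_vir, r_s, c, a, f_gas):
    """Gas density at the virial radius such that the gas mass is f_gas * m_vir."""
    integral = simpson(lambda x: x * x * gas_shape(x, c, a), 0.0, c)
    return f_gas * m_vir / (4.0 * math.pi * r_s**3 * integral)
```

The gas density at scaled radius $x$ is then `rho_gas0(...) * gas_shape(x, c, a)`, and integrating it over the virial sphere returns the requested gas mass.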

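The emissivity of the gas due to thermal bremsstrahlung
at position $\vec x$ in the energy range $E_a \le E \le E_b$ is

$$j_X(\vec x; E_a, E_b) = 5.53\times10^{-24}\,{\rm erg\,cm^{-3}\,s^{-1}} \times \left(\frac{kT}{\rm keV}\right)^{1/2} \left(\frac{n_e}{{\rm cm}^{-3}}\right)^2 \left[\exp\left(-\frac{E_a}{kT}\right) - \exp\left(-\frac{E_b}{kT}\right)\right] . \tag{8}$$
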

Assuming complete ionization, the electron density is
$n_e = 0.52\,\bar m^{-1}\rho_{\rm gas}$. The flux $S_X(\vec\xi)$ received from the
two-dimensional position $\vec\xi$ within the cluster is the line-of-sight
integral

$$S_X(\vec\xi; E_a, E_b) = \frac{1}{4\pi(1+z)^4} \int {\rm d}l\; j_X(\vec\xi, l; E'_a, E'_b) , \tag{9}$$

where the factor $(1+z)^4$ accounts for redshifting the photons
and for the ratio between luminosity distance and angular-diameter
distance, and $E'_{a,b} = (1+z)\,E_{a,b}$. In real observations,
$S_X(\vec\xi; E_a, E_b)$ is further convolved with the detector
response function. The flux can then be converted to photon
numbers by means of the bremsstrahlung spectrum, the detector
area, and the exposure time. Likewise, the observed
photon spectrum is determined by the number of photons
per energy bin $[E_i, E_{i+1}]$, $E_a \le E_i \le E_b$.
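
As a hedged illustration of the band-limited emissivity, the sketch below assumes the standard form of thermal bremsstrahlung for the adopted hydrogen-helium mixture, $5.53\times10^{-24}\,(kT/{\rm keV})^{1/2}\,(n_e/{\rm cm^{-3}})^2\,[e^{-E_a/kT} - e^{-E_b/kT}]$ erg cm$^{-3}$ s$^{-1}$, with a Gaunt factor of unity. The function names are hypothetical.

```python
import math


def brems_emissivity(kt_kev, n_e, e_a, e_b):
    """Emissivity in erg cm^-3 s^-1 between e_a and e_b (keV); n_e in cm^-3."""
    band = math.exp(-e_a / kt_kev) - math.exp(-e_b / kt_kev)
    return 5.53e-24 * math.sqrt(kt_kev) * n_e**2 * band


def rest_frame_band(e_a, e_b, z):
    """Observed band limits shifted to the cluster rest frame, E' = (1 + z) E."""
    return (1.0 + z) * e_a, (1.0 + z) * e_b
```

Because the band factor is a difference of exponentials, the emissivities of adjacent energy bins add up exactly to that of the enclosing band, which is what allows the spectrum to be binned without changing the total flux.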


2.2 Gravitational Lensing 

The gravitational lensing effects of the density profile (1)
have been calculated elsewhere (Bartelmann 1996). Given
the potential (2), the effective lensing potential is

$$\psi = \frac{2}{c^2}\,\frac{D_{\rm ds}}{D_{\rm d} D_{\rm s}} \int {\rm d}l\;\Phi , \tag{10}$$

where $D_{\rm d}$, $D_{\rm s}$, and $D_{\rm ds}$ are the angular-diameter distances
from the observer to the cluster, to the sources, and from the
cluster to the sources, respectively. The lensing convergence
$\kappa$ and shear components $\gamma_{1,2}$ are then

$$\kappa = \frac{1}{2}\left(\psi_{,11} + \psi_{,22}\right) , \quad \gamma_1 = \frac{1}{2}\left(\psi_{,11} - \psi_{,22}\right) , \quad \gamma_2 = \psi_{,12} , \tag{11}$$

where indices $i$ preceded by a comma denote partial derivatives
with respect to $x_i$. The lensing properties of mass profiles
of the form (1) have been worked out by Bartelmann
(1996).

Gravitational lensing leads to coherent distortions of
the images of background galaxies. Image ellipticities, which
can be quantified by, e.g., the quadrupole tensor of their
surface brightness distribution, measure the two-component
reduced shear


$$g_i = \frac{\gamma_i}{1 - \kappa} . \tag{12}$$



If the lens has a critical curve, an ambiguity arises in the $g_i$
because of the parity change upon crossing the critical curve.
An unambiguous measure of the ellipticity is then provided
by the distortion $\delta_i$,

$$\delta_i = \frac{2 g_i}{1 + g_1^2 + g_2^2} . \tag{13}$$
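
The reason why the distortion is unambiguous can be checked numerically. In the sketch below, which uses hypothetical function names and represents the two shear components as one complex number $\gamma_1 + {\rm i}\gamma_2$, the distortion takes the same value for $g$ and for $1/g^*$, the two reduced shears that cannot be told apart across a critical curve.

```python
def reduced_shear(kappa, gamma):
    """Complex reduced shear g = gamma / (1 - kappa), eq. (12)."""
    return gamma / (1.0 - kappa)


def distortion(g):
    """Complex distortion delta = 2 g / (1 + |g|^2), eq. (13)."""
    return 2.0 * g / (1.0 + abs(g) ** 2)


g = reduced_shear(0.3, complex(0.1, 0.05))
# The same distortion results from g and from 1 / conj(g).
same = abs(distortion(g) - distortion(1.0 / g.conjugate())) < 1e-12
```

The modulus of the distortion never exceeds unity, and it reaches unity where $|g| = 1$.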


Since galaxies do not usually appear circular, $\delta$ cannot
be inferred from individual galaxies, but must be determined
statistically by averaging over a sufficient number of galaxy
images. The assumption underlying this inference is that the
intrinsic orientations of the galaxies are random. The measured
galaxy ellipticities (to be related to the $g_i$) are given
by

$$\epsilon_1 + {\rm i}\,\epsilon_2 = \frac{a - b}{a + b}\,\exp(2{\rm i}\varphi) , \tag{14}$$



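with $a$ and $b$ the major and minor axes of the ellipse, respectively,
and $\varphi$ its orientation (position angle). The unlensed
observed ellipticities follow the two-dimensional distribution

$$p_\epsilon(\epsilon_1, \epsilon_2) = \frac{\exp\left(-|\epsilon|^2/\sigma_\epsilon^2\right)}{\pi\sigma_\epsilon^2\left[1 - \exp\left(-\sigma_\epsilon^{-2}\right)\right]} , \tag{15}$$
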


with $\sigma_\epsilon \approx 0.15$ (e.g. Miralda-Escude 1991; Tyson & Seitzer
1988; Brainerd, Blandford, & Small 1996). An iterative procedure
to derive $\delta$ from galaxy ellipticities has been described
by Seitz & Schneider (1995).
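
Intrinsic ellipticities following the distribution above can be drawn by inverting the cumulative distribution of the modulus $|\epsilon|$, which is truncated at $|\epsilon| = 1$. The sketch below is one possible way to do this and is not necessarily the procedure used for the simulations; the function name is hypothetical.

```python
import cmath
import math
import random


def draw_ellipticity(sigma_e, rng=random):
    """Draw one complex ellipticity e1 + i e2 with |e| < 1."""
    norm = 1.0 - math.exp(-1.0 / sigma_e**2)   # normalization from the cut at |e| = 1
    u = rng.random()                           # uniform in [0, 1)
    modulus = sigma_e * math.sqrt(-math.log(1.0 - u * norm))
    phase = 2.0 * math.pi * rng.random()       # random orientation; the phase is 2 * phi
    return cmath.rect(modulus, phase)


rng = random.Random(1)
sample = [draw_ellipticity(0.15, rng) for _ in range(20000)]
```

For $\sigma_\epsilon = 0.15$ the truncation is negligible, so the mean of $|\epsilon|^2$ over a large sample should be close to $\sigma_\epsilon^2$.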

Deep observations (e.g. Small et al. 1995) find galaxy
surface number densities of 40 to 50 arcmin$^{-2}$ down to a
magnitude limit of $R \approx 25$. According to Lilly et al. (1995),
the average redshifts of such sources fall within 0.8 to 1. If
we want to average over $\sim 10$ galaxies for each local estimate
of the distortion, the intrinsic resolution limit for any
such distortion map is $\sim 30''$. The uncertainty in the local
determination of $\delta$ can be estimated by the dispersion of
the $N'$ galaxy ellipticities used to determine $\delta$, divided by
$(N' - 1)^{1/2}$.
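
The resolution estimate is simple arithmetic: 10 galaxies at 40 arcmin$^{-2}$ occupy 0.25 arcmin$^2$, i.e. a square cell of $0.5' = 30''$ on a side. The sketch below spells this out and adds the error estimate, reading the scatter of the ellipticities as their rms dispersion about the mean, which is an assumption of this example. The function names are hypothetical.

```python
import math


def cell_side_arcsec(n_per_cell, density_per_arcmin2):
    """Side length of a square cell holding n_per_cell galaxies on average."""
    area_arcmin2 = n_per_cell / density_per_arcmin2
    return 60.0 * math.sqrt(area_arcmin2)


def distortion_error(ellipticities):
    """rms scatter of the N' ellipticities about their mean, divided by sqrt(N' - 1)."""
    n = len(ellipticities)
    mean = sum(ellipticities) / n
    scatter = math.sqrt(sum(abs(e - mean) ** 2 for e in ellipticities) / n)
    return scatter / math.sqrt(n - 1)
```

Halving the cell size would quarter the number of galaxies per cell and therefore roughly double the error of each local distortion estimate.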


3 COMBINED $\chi^2$ FUNCTION

What questions can we answer by calling statistics to our
assistance? We do not know a priori whether a certain model
provides a good description of the data. We can, however,
find answers to the three following questions:

(i) Given the data, what are the best parameters to describe
them, in the framework of a specific model?

(ii) For these best parameters found earlier, how likely is
the model, given the data?

(iii) Given the data, which of $n$ competing models is the
most likely?

The first answer is provided by the $\chi^2$ minimization,
the second by the goodness-of-fit (GoF) evaluation, and the
third by comparing GoF values as obtained for the $n$ different
models. By “model” we mean a functional parameterization
of the three-dimensional gravitational potential of the
system under consideration.


The ability to answer the aforementioned three questions
depends on our ability to construct a sound $\chi^2$
statistic. The $\chi^2$ statistic can be easily interpreted if the error
estimate is accurate, and the error distribution is Gaussian
or close to Gaussian.

Our analysis makes use of the different sensitivity of
the observables to the potential parameters. The $\chi^2$ statistic
should therefore take into account all observables simultaneously.
For clarity of presentation, however, we present
the various terms in the statistic separately and combine
them later.


3.1 The Temperature Term: $\chi^2_T$

The first term in the $\chi^2$ statistic deals with the emission-weighted
temperature (“temperature” hereafter). In §2.1 we
described the photon counts in the $N_E$ photon energy bins
from which the temperature is estimated. The assumption
of cylindrical symmetry, the independence of temperature
on projected radius due to the assumption of isothermality,
and the poor spatial resolution of the observations lead us
to consider only one annulus, centered on the X-ray flux
centroid, for the temperature evaluation.

The overall number of photons is taken into account
elsewhere (the flux term, see §3.2), and we must avoid taking
it into account twice, or otherwise the terms would not be
independent. We therefore normalize the photon number in
each energy bin by the total observed number of photons in
all energy bins.

In normalized units, the temperature term of the $\chi^2$
statistic is

$$\chi^2_T = \sum_{i=1}^{N_E} \frac{\left[n^\gamma_i - n_\gamma(E_i)\right]^2}{\left(\sigma^i_T\right)^2} . \tag{16}$$



The normalized photon count in energy bin $E_i$ is $n^\gamma_i$. Given
the model temperature and assuming bremsstrahlung radiation,
the model for the cluster and the X-ray background
predicts $n_\gamma(E_i)$ photons in the same bin.§ The error in the
denominator has two contributions that add up in quadrature:
$(\sigma^i_T)^2 = (\sigma^i_{T,1})^2 + (\sigma^i_{T,2})^2$. The two kinds of measurement error
are the instrumental error and the background radiation
that has to be estimated in each frequency bin. These errors
are identical for all models, since we do not consider different
X-ray background radiation models. We assume the
average X-ray background radiation signal is known, so there
is no explicit DC offset of the photon number counts. Fluctuations
in the background, though, should still be taken
into account. A gross approximation for the contributions
of these two terms can be the square root of the number of
observed photons, if the observational errors are all due to
Poisson noise. We thus have

$$\left(\sigma^i_{T,1}\right)^2 + \left(\sigma^i_{T,2}\right)^2 \simeq n^\gamma_i . \tag{17}$$

Recall that here, too, we normalize by the overall number
of observed photons. One has to stay away from very low
photon number counts, where the result is biased by the
lower limit of detecting no photons at all, and the error
symmetry breaks down.

§ When more than one annulus is taken into account, the l.o.s.
integration with the relevant emission weighting must be carried
out as well and added to $\chi^2_T$.

If a model assumes more than one cluster, with different
temperatures, one cannot avoid the integration needed
to calculate the weighted sum that yields the expected projected
temperature.

Note that longer integration time can reduce the relative
importance of the instrumental error ($\sigma_{T,1}$), but cannot
help reduce the relative importance of the background radiation
noise ($\sigma_{T,2}$) as long as it is not due to a short temporal variation.
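
A minimal sketch of the temperature term for a single annulus follows. It assumes pure Poisson errors on the raw counts and propagates them through the normalization by the total number of photons; the function name and the decision to skip empty bins are choices made for this illustration.

```python
def chi2_temperature(counts, model_fractions):
    """Temperature term from raw counts per energy bin and normalized model predictions.

    counts          : observed photon numbers per energy bin
    model_fractions : model prediction for the normalized counts (sums to one)
    """
    total = sum(counts)
    chi2 = 0.0
    for n_obs, frac in zip(counts, model_fractions):
        if n_obs == 0:
            continue  # empty bins are skipped: the symmetric error estimate fails there
        n_norm = n_obs / total          # normalized count in this bin
        variance = n_obs / total**2     # Poisson error sqrt(n_obs), divided by the total
        chi2 += (n_norm - frac) ** 2 / variance
    return chi2
```

Because only the normalized spectrum enters, this term responds to the spectral shape, and hence to the temperature, but not to the overall number of photons.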


3.2 The Flux Term: $\chi^2_S$

The term for the flux is similar to the term for the temperature.
In both cases the data are numbers of photons.
Two distinct differences exist between the two: in the flux
term, the independent data are numbers of photons in spatial
pixels, and the important measure is the actual number,
so it cannot be normalized by the total number observed.

For $N_P$ pixels, with the $i$-th pixel centered on $\vec\xi_i$, $N^\gamma_i$
measured photons in the pixel, and $N_\gamma(\vec\xi_i)$ photons expected
from the model, the flux term is written as

$$\chi^2_S = \sum_{i=1}^{N_P} \frac{\left[N^\gamma_i - N_\gamma(\vec\xi_i)\right]^2}{\left(\sigma^i_S\right)^2} . \tag{18}$$

Similarly to the $\sigma_T$ calculation, here too we have the same
two contributions which we can approximate by $(\sigma^i_S)^2 \simeq N^\gamma_i$.
In the case of the flux, we are interested in both the
absolute number of photons and their spatial distribution in
the projected two-dimensional map as a function of $\vec\xi$.


3.3 The Ellipticity Term: $\chi^2_\delta$


The data “unit” for the shear field, as explained in §2.2, is
an area of typically 0.2 arcmin$^2$ in which there are enough
background galaxies ($\sim 10$) to average over, for deriving
the mean reduced distortion $\delta$ (cf. eq. 13) in this area element.
The error in the derived distortion in each bin is
model independent and can be calculated either by the dispersion
about the average distortion in a given area, or by
taking a non-lensed region, deriving the intrinsic ellipticity
distribution for the same galaxy population, and dividing
by the square root of the number of galaxies in each bin.
The two methods give similar results of $\sigma_\delta \approx 0.03$. Since the
area element sizes are identical across the cluster, so is the
error for the average ellipticity values. The $\chi^2$ term for the
ellipticities is readily written

$$\chi^2_\delta = \frac{1}{\sigma_\delta^2} \sum_{i=1}^{N_\delta} \left|\delta_i - \delta(\vec\xi_i)\right|^2 , \tag{19}$$

where, as usual, $\delta(\vec\xi_i)$ is the distortion expected from the
model about the position $\vec\xi_i$. The sum is over all regions, $i$,
for which the ellipticity is evaluated ($N_\delta$ regions altogether).


3.4 The Combined $\chi^2$

As stated earlier, the idea is to search for $\chi^2$ minima in
the potential parameter space, using all observables simultaneously.
Say we have specified a functional form for the
potential $\Phi(r)$ that involves only two fitting parameters,
plus one parameter for the gas fraction. The X-ray temperature,
the X-ray flux, and the distortion field in any bin, are
all functionals of this potential. They can all be combined
to result in one (complicated, non-linear) function. So can
the data be combined. The overall $\chi^2$ statistic has therefore
$N_E + N_P + N_\delta - 3$ degrees of freedom (d.o.f.), and is simply
the sum

$$\chi^2 = \chi^2_T + \chi^2_S + \chi^2_\delta . \tag{20}$$

Notice that the number of d.o.f. for this function is not the
sum of the number of d.o.f. when each individual term is
considered separately.
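
The bookkeeping of the combined statistic can be summarized in a few lines. The sketch below uses hypothetical function names and treats every term as a sum of squared, error-normalized residuals. For illustration it counts 121 energy bins, 1024 flux pixels and 1024 distortion cells with three fitted parameters; reading the 2 x 1024 data points quoted for the simulations as equal numbers of flux pixels and distortion cells is an assumption of this example.

```python
def chi2_term(data, model, sigma):
    """Sum of squared residuals in units of the errors."""
    return sum(((d - m) / s) ** 2 for d, m, s in zip(data, model, sigma))


def combined_chi2(chi2_t, chi2_s, chi2_delta):
    """Sum of the temperature, flux and ellipticity terms."""
    return chi2_t + chi2_s + chi2_delta


def degrees_of_freedom(n_e, n_p, n_delta, n_par=3):
    """Number of data points of all three kinds minus the number of fitted parameters."""
    return n_e + n_p + n_delta - n_par


n_dof = degrees_of_freedom(121, 1024, 1024)
```

The parameters are subtracted once from the total number of data points, which is why the result differs from the sum of the degrees of freedom of three separate fits.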

Different models may have different numbers of fitting
parameters [e.g. two spherically symmetric clusters along the
l.o.s. may be specified by five or six parameters (depending on the
universality of $f_{\rm gas}$) plus the separation between the two
clusters]. A fair comparison between models must take this
simple fact into account.

The $\chi^2$ minimization leaves us with a $\chi^2_{\rm min}$ for each
model, and an estimate of the best fitting parameters for
this model. In order to assess to what extent a specific model
provides an adequate description for the data, the GoF is
calculated according to

$$G = \Gamma\left(\frac{N_{\rm dof}}{2}, \frac{\chi^2_{\rm min}}{2}\right) , \tag{21}$$



with $\Gamma$ the incomplete gamma function (normalized by the
complete gamma function). The GoF interpretation
rests on two assumptions: (i) that the data that went
into the $\chi^2$ calculation are independent (so that the d.o.f.
calculation truly represents the d.o.f. of the data and the
model), and (ii) that the errors are distributed in a Gaussian
fashion and uncorrelated. Validation of the second assumption
can be carried out by inspection of the distribution of
residuals. If errors are indeed all due to a Poissonian process
in the data collection, we have reasons to believe that
by the central limit theorem, the errors are Gaussian. The
prudent policy of taking large bins (in the relevant context
for each observable) pays off by producing minimally correlated
errors and independent processed data points.
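
The goodness-of-fit can be evaluated with the regularized upper incomplete gamma function $Q(a, x) = \Gamma(a, x)/\Gamma(a)$. The sketch below implements it with the Python standard library only, using the textbook power series for $x < a + 1$ and a continued fraction otherwise; it is an illustration under these assumptions, not the code used for the results quoted in this paper, and the function names are hypothetical.

```python
import math


def gammq(a, x, eps=1e-14, itmax=100000):
    """Regularized upper incomplete gamma function Q(a, x)."""
    if a <= 0.0 or x < 0.0:
        raise ValueError("need a > 0 and x >= 0")
    if x == 0.0:
        return 1.0
    log_pref = -x + a * math.log(x) - math.lgamma(a)
    if x < a + 1.0:
        # Power series for P(a, x); Q = 1 - P.
        ap, term = a, 1.0 / a
        total = term
        for _ in range(itmax):
            ap += 1.0
            term *= x / ap
            total += term
            if abs(term) < abs(total) * eps:
                break
        return 1.0 - total * math.exp(log_pref)
    # Continued fraction for Q(a, x), evaluated with the modified Lentz method.
    tiny = 1e-300
    b = x + 1.0 - a
    c = 1.0 / tiny
    d = 1.0 / b
    h = d
    for i in range(1, itmax + 1):
        an = -i * (i - a)
        b += 2.0
        d = an * d + b
        if abs(d) < tiny:
            d = tiny
        c = b + an / c
        if abs(c) < tiny:
            c = tiny
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h * math.exp(log_pref)


def goodness_of_fit(n_dof, chi2_min):
    """Probability that chi^2 exceeds chi2_min by chance for n_dof degrees of freedom."""
    return gammq(0.5 * n_dof, 0.5 * chi2_min)
```

A value of $\chi^2_{\rm min}$ close to the number of degrees of freedom gives a GoF near one half, while an excess of several times $(2 N_{\rm dof})^{1/2}$ drives it towards zero.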


4 DEMONSTRATION BY SIMULATIONS 
4.1 Model Specification 

We can now proceed and apply our technique to idealized
test cases. We consider two such cases: either one isolated
cluster, or two clusters projected onto each other along the
line-of-sight. In the first case, all observables are completely
specified by three parameters, viz. the two parameters of the
dark-matter profile, for which we take the virial mass $M_{\rm vir}$
and the scale radius $r_s$, and the gas fraction $f_{\rm gas}$ (by weight)
within the virial radius.

Assuming that the gas fraction is “universal”, we have
five parameters to describe two clusters, plus their mutual
distance. If the clusters are sufficiently distant such that
their gas distributions do not significantly overlap, their exact
separation does not matter. This applies once they are
separated by more than about the sum of their scale radii. If
they are so close to each other that they are in the process of
merging, their gas distributions become more complicated,
especially because shocks form, hydrostatic equilibrium does
not apply, and their dark-matter distributions are deformed.
We assume here that the two clusters are sufficiently well
separated such that their gas distributions do not interact,
and then five parameters suffice to characterize their X-ray
and lensing properties.

Figure 1. Examples for the simulated data we use. Panel (a): Simulated X-ray flux map. The contours are logarithmically spaced,
in units of counts per pixel. The pixel size is 13'' x 13'', the field size is 7' x 7' (the field has 32 x 32 pixels). Panel (b): Simulated X-ray
spectrum, overlaid with a fit to the bremsstrahlung spectrum (dashed curve). The best-fit temperature is given in the plot, together with
its $1\sigma$ error. Panel (c): Distortion map produced from simulated background galaxy ellipticities. The same potential was used as for the
X-ray data in the other panels. The length of the lines indicates the modulus of $\delta$, their orientation shows the direction of $\delta$.

We consider simulated clusters at a redshift of $z_c = 0.2$,
with a gas fraction of $f_{\rm gas} = 10\%$, scale radii of
$r_s = 0.25\,h^{-1}$ Mpc, and a total virial mass of $M_{\rm vir} = 10^{15}\,M_\odot$.
When there are two clusters, this is the sum of the individual
virial masses. The lensed sources are put at a redshift of
$z_s = 1$.

We choose energy bins such as to mimic the energy
resolution of the ASCA SIS (Tanaka, Inoue, & Holt 1994),
which results in $N_E = 121$ energy bins between $E_a = 0.1$ keV
and $E_b = 12$ keV. The energy dependence of the effective detector
area is modeled like that of the ROSAT HRI. The
spectral energy distribution yields the emission-weighted
temperature, which equals $T$ for isothermal gas in a single
cluster, or in a double cluster where both components
have equal mass. For the noise in the photon counts, we use
Poisson noise plus an additional background, for which we
choose a DC level of $3 \times 10^{-4}$ s$^{-1}$ arcmin$^{-2}$, and a Gaussian
distribution of the variation about this level with a
standard deviation $\sigma_{S,2}$ determined by $N_b$, the number of
background photons per exposure. This is in approximate
agreement with the background noise in the ROSAT PSPC
(Snowden et al. 1995). We ignore any energy dependence of
the X-ray background. An example for a flux map and a
spectrum simulated this way is shown in panels (a) and (b)
of Fig. 1. Throughout, we have assumed an exposure time
of 10 ksec.

For the background galaxies, we choose a surface number
density of 40 arcmin$^{-2}$ at a redshift of $z_s = 1$. Their
positions are random, and their unlensed ellipticities are
drawn from the two-dimensional distribution of eq. (15) with
$\sigma_\epsilon = 0.15$. The galaxies are then distorted by the lensing effect
of the simulated clusters, and the distortion $\delta$ is determined
using the iterative algorithm provided by Schneider
& Seitz (1995). An example for the distortion map $\delta$ created
by the lensing effect of the cluster whose X-ray emission is
shown in panels (a) and (b) of Fig. 1 is displayed in panel
(c) of the same figure.

Table 1. Parameters obtained from fitting a single cluster with
a single cluster. $M_{\rm vir}$ is the virial mass, $r_s$ is the scale radius,
$f_{\rm gas}$ is the gas fraction, and $G$ is the goodness-of-fit according to
eq. (21). $3\sigma$ errors are given. On the whole, the input parameters
are well recovered, and all lie within the $1\sigma$ contour level of the
$\chi^2$ minimization.

Parameter   $M_{\rm vir}$ [$10^{14}\,M_\odot$]   $r_s$ [$h^{-1}$ Mpc]   $f_{\rm gas}$ [%]   $G$ [%]
input       10.0                                 0.250                  10.0                -
best fit    10.0 ± 0.3                           0.250 ± 0.004          10.0 ± 0.3          48.4


4.2 Single-cluster case 

We begin with “observations” created from a single cluster,
and try to fit them with a model of the same functional
form as used in the simulation, consisting of a single cluster.
The best-fit parameters and the goodness-of-fit $G$ at the
$\chi^2$ minimum are given in Tab. 1. The $\chi^2$ per degree of freedom
in this case is 1.001 which, for the number of degrees
of freedom we have ($N_{\rm dof} = 2 \times 1024 + 121 - 3 = 2166$),
yields a goodness-of-fit of $G = 48.4\%$. Examples with different
realizations of the synthetic data show that these results
are typical. As the table shows, the input cluster parameters
are well reproduced. $\chi^2$ contours in the $M_{\rm vir}$-$r_s$ and the
$M_{\rm vir}$-$f_{\rm gas}$ planes are shown in Fig. 2.
Figure 2. Cuts through the parameter space for fitting synthetic
observations simulated with one cluster. The parameters are the
virial mass $M_{\rm vir}$, the scale radius $r_s$, and the gas fraction within
the virial radius $f_{\rm gas}$. Upper panel: $\chi^2$ contours in the $M_{\rm vir}$-$r_s$
plane; lower panel: $\chi^2$ contours in the $M_{\rm vir}$-$f_{\rm gas}$ plane. The
contours are $1\sigma$, $2\sigma$, and $3\sigma$ confidence levels. The cross marks
the best-fit parameters, the triangle the parameters of the input
model.


4.3 Double-cluster case 

We now proceed to “observations” simulated with two clusters, with virial masses $M_{\rm vir,1} = (1 - m)\,(M_{\rm vir,1} + M_{\rm vir,2})$ and $M_{\rm vir,2} = m\,(M_{\rm vir,1} + M_{\rm vir,2})$, projected onto each other along the line-of-sight. The first question is whether it is possible to significantly detect that the input cluster is double. This is the case if an attempt to fit the data with one cluster only results in a best-fit $\chi^2$ which yields an unacceptable goodness-of-fit. We create synthetic data with two clusters, keeping the scale radii $r_{\rm s} = 0.25\,h^{-1}\,$Mpc, the gas fraction $f_{\rm gas} = 10\%$, and the total cluster mass $M_{\rm vir,1} + M_{\rm vir,2} = 10^{15}\,M_\odot$ constant. We then vary the mass


Table 2. Results from attempts at fitting with one cluster synthetic data that were simulated with two clusters. The input models consist of two clusters whose scale radii $r_{\rm s,1,2} = 0.25\,h^{-1}\,$Mpc, gas fraction $f_{\rm gas} = 10\%$, and total mass $M_{\rm vir,1} + M_{\rm vir,2} = 10^{15}\,M_\odot$ are kept constant, while their mass ratio $m = M_{\rm vir,1}/M_{\rm vir,2}$ is varied. The table shows the mass $M_{\rm vir}$, scale radius $r_{\rm s}$, and gas fraction $f_{\rm gas}$ of the best-fitting single-cluster model. The $\chi^2$ is to be compared to the number of degrees of freedom, $N_{\rm dof} = 2166$. $G$ is the goodness-of-fit as defined earlier. $3\sigma$ errors are given.

m      M_vir           r_s              f_gas        χ²      G
       [10^14 M_⊙]     [h^-1 Mpc]       [%]                  [%]
1:1    13.0 ± 0.2      0.275 ± 0.002    6.8 ± 0.1    3129    < 10^-4
1:2    12.9 ± 0.1      0.271 ± 0.001    7.0 ± 0.1    2868    < 10^-4
1:3    12.6 ± 0.1      0.268 ± 0.001    7.4 ± 0.1    2634    < 10^-4
1:4    12.4 ± 0.2      0.266 ± 0.003    7.7 ± 0.2    2504    < 10^-4
1:5    11.7 ± 0.1      0.265 ± 0.001    8.3 ± 0.1    2481    2 × 10^-4
1:6    11.5 ± 0.2      0.263 ± 0.002    8.4 ± 0.2    2425    7 × 10^-3

ratio $m = M_{\rm vir,1}/M_{\rm vir,2}$. We investigate the cases $m$ = {1:1, 1:2, ..., 1:6}. Typical results are summarized in Tab. 2. The table shows that for all mass ratios, the single-cluster models fail to fit the data acceptably. In turn, this implies that we can significantly distinguish between single- and double-cluster cases even when one cluster is up to six times more massive than the other.

Since for large $N_{\rm dof}$, the $\chi^2$ distribution approaches a Gaussian with mean $N_{\rm dof} = 2166$ and standard deviation $\sigma_{\chi^2} = (2N_{\rm dof})^{1/2} \approx 65.8$, the formal significance for rejecting the single-cluster hypothesis is $\approx 15\,\sigma_{\chi^2}$ for $m$ = 1:1, and $\approx 4\,\sigma_{\chi^2}$ for $m$ = 1:6.
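
The quoted significances follow directly from the best-fit $\chi^2$ values of the single-cluster fits. The sketch below simply repeats that arithmetic under the Gaussian approximation stated in the text; the dictionary name `chi2_single` is chosen here for illustration, and the numbers are copied from Table 2.

```python
import math

n_dof = 2166
# Standard deviation of the chi^2 distribution in the large-n_dof limit
sigma_chi2 = math.sqrt(2 * n_dof)

# Best-fit chi^2 of the single-cluster model for each mass ratio (Table 2)
chi2_single = {"1:1": 3129, "1:2": 2868, "1:3": 2634,
               "1:4": 2504, "1:5": 2481, "1:6": 2425}

# Excess of chi^2 over its expectation value, in units of sigma_chi2
significance = {m: (chi2 - n_dof) / sigma_chi2
                for m, chi2 in chi2_single.items()}

for m, s in significance.items():
    print(f"m = {m}: {s:.1f} sigma")
```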

The expected trend is seen: $\chi^2$ decreases with $m$, since for $m \to 0$ the minimization should converge to the single-cluster result. Examining the figures in the table further, we notice that the total mass of the system is always overestimated by about 15–30%, the scale radius is always overestimated by 5–10%, and the gas fraction is always underestimated by 15–30% (for these values of $m$). The quoted errors thus do not represent the true errors, because the systematic errors from assuming the wrong model are ignored. The interpretation of this deviation is as follows: the minimization routine “detects” too low a temperature for the amount of flux it “sees”. It therefore tries to increase the amount of flux without changing the temperature, by widening the potential well (i.e. increasing $r_{\rm s}$) without making the well substantially deeper, or equivalently without creating an unacceptable mismatch with the lensing distortion. This in turn leads to a higher enclosed mass and a slightly too high flux. The latter is compensated by reducing $f_{\rm gas}$. This explains the biased parameters returned by the minimization.
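
The size of these systematic shifts can be read off by comparing the best-fit single-cluster parameters with the input values. The following sketch does this for every mass ratio; the helper `percent_bias` and the variable names are illustrative, and the numbers are copied from Table 2 and from the input model described in the text.

```python
# Input values common to all double-cluster simulations:
# total mass [10^14 M_sun], scale radius [h^-1 Mpc], gas fraction [%]
true_mass, true_rs, true_fgas = 10.0, 0.25, 10.0

# Best-fit single-cluster parameters (M_vir, r_s, f_gas) from Table 2
best_fit = {
    "1:1": (13.0, 0.275, 6.8),
    "1:2": (12.9, 0.271, 7.0),
    "1:3": (12.6, 0.268, 7.4),
    "1:4": (12.4, 0.266, 7.7),
    "1:5": (11.7, 0.265, 8.3),
    "1:6": (11.5, 0.263, 8.4),
}

def percent_bias(fitted, true):
    """Signed deviation of the fitted value from the true value, in percent."""
    return 100.0 * (fitted - true) / true

bias = {
    m: (percent_bias(mass, true_mass),
        percent_bias(rs, true_rs),
        percent_bias(fgas, true_fgas))
    for m, (mass, rs, fgas) in best_fit.items()
}

for m, (b_mass, b_rs, b_fgas) in bias.items():
    print(f"m = {m}: mass {b_mass:+.0f}%, r_s {b_rs:+.1f}%, f_gas {b_fgas:+.0f}%")
```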

Having seen that we can significantly reject the hypothesis that the synthetic data were simulated with a single cluster, we should now ask whether we get an acceptable goodness-of-fit for the double-cluster model. Furthermore, how well can we recover the parameters of the individual clusters? For that purpose, we take the same double-cluster data that were created earlier and fit them with a double-cluster model. Table 3 summarizes the results. The number
Table 3. Results from fitting with a double-cluster model synthetic data that were simulated with two clusters. $m$ is the mass ratio, $M_{\rm vir,1,2}$ are the virial masses, $r_{\rm s,1,2}$ the scale radii, $f_{\rm gas}$ is the gas fraction (assumed to be the same in both clusters), and $G$ is the goodness-of-fit as defined earlier. $3\sigma$ errors are given. The input parameters are all well recovered. The errors are somewhat larger than in the single-cluster case.

            m              M_vir,1      M_vir,2      r_s,1            r_s,2            f_gas         G
                           [10^14 M_⊙]  [10^14 M_⊙]  [h^-1 Mpc]       [h^-1 Mpc]       [%]           [%]
input       1:1            5.0          5.0          0.25             0.25             10.0          -
best fit    1.0 ± 0.4      5.0 ± 1.4    5.0 ± 1.4    0.25 ± 0.02      0.25 ± 0.02      10.1 ± 0.3    23.1
input       1:2            6.7          3.3          0.25             0.25             10.0          -
best fit    0.49 ± 0.09    6.7 ± 0.6    3.3 ± 0.5    0.25 ± 0.01      0.25 ± 0.01      10.0 ± 0.3    26.0
input       1:3            7.5          2.5          0.25             0.25             10.0          -
best fit    0.33 ± 0.04    7.5 ± 0.3    2.6 ± 0.3    0.251 ± 0.002    0.251 ± 0.003    10.0 ± 0.2    24.1
input       1:4            8.0          2.0          0.25             0.25             10.0          -
best fit    0.26 ± 0.03    8.0 ± 0.3    2.1 ± 0.2    0.251 ± 0.003    0.250 ± 0.004    10.0 ± 0.2    24.5
input       1:5            8.3          1.7          0.25             0.25             10.0          -
best fit    0.22 ± 0.02    8.4 ± 0.2    1.8 ± 0.2    0.251 ± 0.002    0.250 ± 0.002    10.0 ± 0.2    27.9
input       1:6            8.6          1.4          0.25             0.25             10.0          -
best fit    0.17 ± 0.01    8.6 ± 0.1    1.5 ± 0.1    0.251 ± 0.002    0.249 ± 0.002    10.0 ± 0.2    27.9

of degrees of freedom is now reduced by two, $N_{\rm dof} = 2164$. The values of $\chi^2/N_{\rm dof}$ are now typically $\approx 1.02$, resulting in goodness-of-fit values of $G \approx 25\%$. The input parameters are all well recovered. The $3\sigma$ errors are somewhat larger than in the case of one cluster. They are largest for mass ratio $m$ = 1:1, namely $\approx 28\%$ for $M_{\rm vir}$, $\lesssim 8\%$ for $r_{\rm s}$, and $\approx 3\%$ for $f_{\rm gas}$, and they decrease to a few percent for smaller mass ratios. As an example, we show in Fig. 3 two cuts through the $\chi^2$ parameter space of a double-cluster model with mass ratio $m$ = 1:3. The upper panel shows $\chi^2$ contours in the $M_{\rm vir,1}$-$M_{\rm vir,2}$ plane, the lower panel shows $\chi^2$ contours in the $r_{\rm s,1}$-$r_{\rm s,2}$ plane. The fairly large elongation of the $\chi^2$ ellipses in the former case illustrates the comparatively large uncertainty in the masses: within a fairly broad range, it is possible to increase or decrease one mass at the expense of the other.
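
The fractional accuracies quoted above can be reproduced from the tabulated best-fit values and their $3\sigma$ errors. The sketch below does so for the two extreme mass ratios; the dictionary layout and the helper `fractional_errors` are illustrative choices, and the numbers are copied from Table 3.

```python
# Best-fit value and 3-sigma error for selected parameters (Table 3).
# M_vir refers to the first cluster, in 10^14 M_sun; r_s in h^-1 Mpc;
# f_gas in percent.
fits = {
    "1:1": {"M_vir": (5.0, 1.4), "r_s": (0.25, 0.02), "f_gas": (10.1, 0.3)},
    "1:6": {"M_vir": (8.6, 0.1), "r_s": (0.251, 0.002), "f_gas": (10.0, 0.2)},
}

def fractional_errors(params):
    """Return the 3-sigma error of each parameter as a percentage of its value."""
    return {name: 100.0 * err / value for name, (value, err) in params.items()}

frac = {m: fractional_errors(p) for m, p in fits.items()}

for m, errors in frac.items():
    summary = ", ".join(f"{name}: {e:.1f}%" for name, e in errors.items())
    print(f"m = {m}: {summary}")
```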

We also ran simulations where the clusters had different scale radii, $r_{\rm s,1} \ne r_{\rm s,2}$. When the less massive cluster has a smaller scale radius, that radius is recovered less precisely, because it is masked by the larger $r_{\rm s}$ of the dominant, more massive cluster. Even then, the masses of the individual clusters are well recovered.


5 A COMPARISON TO $\beta$ FITS


The conventional way to interpret X-ray observations is to 
azimuthally average the flux map and fit the functional form 


$$S_X(r) \propto \left[1 + \left(\frac{r}{r_{\rm c}}\right)^2\right]^{-3\beta + 1/2} \qquad (23)$$

to the resulting flux profile. Assuming that the X-ray emitting gas is isothermal and in hydrostatic equilibrium with a spherically symmetric gravitational potential, the total mass implied by eq. (23) is


$$M_\beta(r) = \frac{3\beta\,r\,kT}{G\bar{m}}\,\frac{x^2}{1 + x^2}, \qquad (24)$$

where $x = r/r_{\rm c}$. We apply this technique to the flux map shown in Fig. 1. The model (23) provides an excellent fit to the flux profile, with $\beta = 0.74$ and $r_{\rm c} = 74.9\,h^{-1}\,$kpc (cf. Fig. 4). At $r_{\rm lim} = 0.49\,h^{-1}\,$Mpc, the flux profile drops below the background noise. At that radius, the spectral temperature of $T = (5.1 \pm 0.4)\,$keV yields, together with the other parameters, $M_\beta(r_{\rm lim}) = (2.0 \pm 0.2) \times 10^{14}\,M_\odot$ ($3\sigma$ errors). Given these results, we can further determine the gas fraction required to explain the total X-ray flux. At $3\sigma$ confidence (only noise included), it turns out to be $f'_{\rm gas} = (18 \pm 1)\%$.
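
As a plausibility check, the $\beta$-model mass can be evaluated numerically from the fitted parameters. The sketch below implements the standard isothermal $\beta$-model mass, $3\beta r kT/(G\bar{m}) \cdot x^2/(1+x^2)$, in cgs units. The mean particle mass is an assumption made here ($\bar{m} = \mu m_{\rm p}$ with $\mu = 0.59$), since its value is not stated in this section, and the function name `beta_model_mass` is hypothetical.

```python
# Physical constants in cgs units
G_CGS = 6.674e-8        # gravitational constant [cm^3 g^-1 s^-2]
M_PROTON = 1.6726e-24   # proton mass [g]
KEV = 1.6022e-9         # 1 keV in erg
MPC = 3.0857e24         # 1 Mpc in cm
M_SUN = 1.989e33        # solar mass [g]

def beta_model_mass(r_mpc, r_c_mpc, beta, kT_keV, mu=0.59):
    """Isothermal beta-model mass within radius r, in solar masses.

    Radii are in Mpc (or h^-1 Mpc, in which case the result is in
    h^-1 M_sun). mu is the mean molecular weight; 0.59 is an assumed
    value for a fully ionized primordial plasma.
    """
    x = r_mpc / r_c_mpc
    prefactor = 3.0 * beta * (r_mpc * MPC) * (kT_keV * KEV) / (G_CGS * mu * M_PROTON)
    return prefactor * x**2 / (1.0 + x**2) / M_SUN

# Fit parameters quoted in the text: beta = 0.74, r_c = 74.9 h^-1 kpc,
# r_lim = 0.49 h^-1 Mpc, T = 5.1 keV
mass = beta_model_mass(r_mpc=0.49, r_c_mpc=0.0749, beta=0.74, kT_keV=5.1)
print(f"M_beta(r_lim) = {mass:.2e} solar masses")
```

Under these assumptions the result should fall within the quoted error range of the $\beta$-fit mass; a different choice of $\mu$ rescales it inversely.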

The flux map in Fig. 1 was simulated using two clusters of $M_{\rm vir,1} = M_{\rm vir,2} = 5 \times 10^{14}\,M_\odot$ and $r_{\rm s} = 0.25\,h^{-1}\,$Mpc, so that the total mass within the radius accessible to X-ray observations should be $M_{\rm total}(r_{\rm lim}) = 4.26 \times 10^{14}\,M_\odot$. The input gas fraction was $f_{\rm gas} = 10\%$. Of course, $f'_{\rm gas}$ from the $\beta$ fit is now the gas fraction within the observable radius $r_{\rm lim}$ rather than the virial radius $r_{\rm vir}$, with $r_{\rm lim} \approx r_{\rm vir}/3$, hence the prime on $f'_{\rm gas}$. The gas fraction of the input model depends slightly on $r$. At $r_{\rm lim}$, it is $f'_{\rm gas} = 9.2\%$ rather than 10%. The $f'_{\rm gas}$ obtained from the $\beta$-fit technique therefore overestimates the true gas fraction by a factor of $\approx 1.8$–$2.1$. Evidently, the estimates from the $\beta$ fit differ substantially from the true values even though the $\beta$ fit is excellent and the gas is in hydrostatic equilibrium within each of the two clusters.
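
The quoted overestimate is simple arithmetic on the numbers given above, and a short sketch makes the comparison with the mass underestimate explicit. It assumes that the $\beta$-fit mass and the true mass within $r_{\rm lim}$ are expressed in the same units of $10^{14}\,M_\odot$; the variable names are illustrative.

```python
# Gas fraction within r_lim, in percent
f_gas_beta, f_gas_beta_err = 18.0, 1.0   # beta-fit estimate with its 3-sigma error
f_gas_true = 9.2                          # input model value at r_lim

# Range of the overestimate factor allowed by the 3-sigma error
factor_low = (f_gas_beta - f_gas_beta_err) / f_gas_true
factor_high = (f_gas_beta + f_gas_beta_err) / f_gas_true

# Total mass within r_lim (assumed units: 10^14 M_sun for both values)
mass_true, mass_beta = 4.26, 2.0
mass_underestimate = mass_true / mass_beta

print(f"gas fraction overestimated by a factor {factor_low:.1f} to {factor_high:.1f}")
print(f"total mass underestimated by a factor {mass_underestimate:.1f}")
```

The two factors are of similar size, as expected when the gas fraction is inferred as a ratio of gas mass to total mass.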

Part of the gross discrepancy between the true gas fraction and that inferred from the $\beta$ fit comes from the fact that fitting the X-ray observations alone does not give any clue as to the structure of the cluster along the line-of-sight. Other contributions emerge from the attempt to rely on the X-ray data alone without appealing to the lensing data.

For a single-cluster model with $M_{\rm vir} = 10^{15}\,M_\odot$ instead of the two-cluster model with the same total mass and a
Figure 3. Two cuts through the $\chi^2$ parameter space for fitting synthetic observations simulated with two clusters. There are five parameters in total: the two virial masses $M_{\rm vir,1,2}$, the two scale radii $r_{\rm s,1,2}$, and the gas fraction within the virial radius $f_{\rm gas}$ (assumed to be the same in both clusters). The mass ratio is $m$ = 1:3. Upper panel: $\chi^2$ contours in the $M_{\rm vir,1}$-$M_{\rm vir,2}$ plane; lower panel: $\chi^2$ contours in the $r_{\rm s,1}$-$r_{\rm s,2}$ plane. The contours are for the same confidence levels as in Fig. 2. Crosses mark the best-fit parameters, triangles the input parameters.

mass ratio of $m$ = 1:1, the $\beta$-fit mass is $M_\beta(r_{\rm lim}) = (3.6 \pm 0.4) \times 10^{14}\,M_\odot$, while the true mass within $r_{\rm lim}$ is $3.7 \times 10^{14}\,M_\odot$. Therefore, in the single-cluster case, the $\beta$-fit mass is fairly accurate within the observable radius. The gross overestimate of the gas fraction in the double-cluster case is thus largely caused by the underestimate of the total mass.

In addition, the $\beta$ fit profile (23) implies that the gas and dark matter density profiles flatten off at radii smaller than the core radius $r_{\rm c}$. The true (input) dark matter density profile has a central cusp $\propto r^{-1}$. The isothermal gas density profile derived earlier approaches the center exponentially, $\propto \exp(-Ax)$. The actual central gas density is higher than deduced from the $\beta$ fit,

Figure 4. The azimuthally averaged flux profile (diamonds) of the cluster shown in Fig. 1, overlaid with a $\beta$ fit profile (cf. eq. 23; dashed line). The parameters of the fit (the core radius $r_{\rm c}$ and $\beta$) are given in the figure. At $r_{\rm lim} = 0.49\,h^{-1}\,$Mpc, the flux profile drops below the background level. The gas fraction within this radius, chosen such that the total cluster X-ray luminosity is reproduced, is $f'_{\rm gas} = (18 \pm 1)\%$. The total mass within $r_{\rm lim}$, implied by the $\beta$-fit parameters together with the temperature given in Fig. 1 and eq. (24), is $M_\beta(r_{\rm lim}) = (2.0 \pm 0.2) \times 10^{14}\,M_\odot$ ($3\sigma$ errors).

and therefore the actual total gas mass required to reproduce the observed X-ray flux is smaller than inferred. Even if we use the $\beta$ model to interpret “observations” simulated with only one cluster, the best-fit gas fractions obtained are still systematically too high by $\approx 20$–$40\%$.

We also performed the counter-experiment of simulating data with a King profile (i.e. $\beta = 1$) rather than the NFW profile, and fitting them with the NFW profile. In this case, the $\beta$-fit technique recovers the input parameters very well, including the gas fraction. When the core radius is small, $r_{\rm c} \lesssim 0.2\,h^{-1}\,$Mpc, the fit with the NFW profile fails. It yields a marginally acceptable goodness-of-fit of $G = 8\%$ when the core radius is larger, $r_{\rm c} \gtrsim 0.25\,h^{-1}\,$Mpc. However, the best-fit mass is then $M_{\rm vir} = (11.4 \pm 0.5) \times 10^{14}\,M_\odot$ instead of the input $M_{\rm vir} = 10^{15}\,M_\odot$.


6 SUMMARY AND OUTLOOK 

Resolving the line-of-sight (l.o.s.) density profile of what appears to be a single cluster is not a hopeless task. In this paper, we have presented an approach that may ultimately lead to a clear distinction between different l.o.s. profiles. The key idea is to combine all the available information for the cluster, using simultaneously the X-ray data (their spatial and energy distribution) and the gravitational lensing properties of the cluster(s).

Two additional pieces of information were left out of the calculation. The first is the redshift distribution of cluster galaxies, which we omitted because of the possible complications due to velocity bias that may interfere with the mass estimate, contamination by non-member galaxies along the line-of-sight, and triple-valued zones. We excluded this information from the analysis even though it may help demonstrate the existence of a bimodal distribution in the case of two clusters. The second piece of information is the distribution of background galaxy sizes. This was left out because the intrinsic size distribution is broader than the ellipticity distribution, and consequently the additional constraints gained from including magnification effects are fairly weak.


© 0000 RAS, MNRAS 000, 000-000


10 M. Bartelmann & T.S. Kolatt

We restricted our investigation to the question of how well a single cluster can be distinguished from two well separated clusters along the l.o.s. We have demonstrated, by using realistic simulations of cluster observations, that with our method a single-cluster model for two clusters along the l.o.s. can be rejected at the $\gtrsim 4$–$15\sigma$ level, depending on the mass ratio between the two clusters.

The true (input) parameters of the system, i.e. the total mass $M_{\rm vir}$ within a certain overdensity level, the scale radius $r_{\rm s}$, and the gas content $f_{\rm gas}$, can be recovered with typical ($3\sigma$) fractional accuracies of a few percent for all parameters of single clusters. In the double-cluster case, the errors are largest when the mass ratio is close to unity, and they decrease to a few percent for smaller mass ratios. The separation between the clusters in a two-cluster system, however, is poorly constrained.

We have further shown that the $\beta$ fit can yield wrong cluster parameters even when it provides an apparently appropriate fit to the X-ray flux data. Most of this effect can be ascribed to the fact that the $\beta$-fit technique is unable to recognize whether an apparent cluster is single or double. The method we propose in this paper does not suffer from that drawback, and hence we propose that it supersede the $\beta$ fit for mass and gas-fraction estimates.

We note that there is an increasing number of clusters for which there is evidence that they consist of two projected clusters rather than of a single one. A well-known example is Abell 1689. Our choice of the singular NFW density profile is well motivated by numerical simulations (Navarro et al. 1996; Cole & Lacey 1996; Huss et al. 1997), and by observations which demonstrate that the mass profile derived from galaxy velocity data does not flatten off at small radii (Carlberg et al. 1997). In addition, strong gravitational lensing requires cluster cores to be much smaller than inferred from X-ray observations alone, if cores exist at all.

Reality spans a much broader range than what we examined in this paper. To begin with, two clusters along the l.o.s. can be in the process of merging. In that case hydrostatic equilibrium ceases to be a reasonable assumption and so does the isothermal model. Shocks due to the merging process heat the intracluster medium in an inhomogeneous fashion that is difficult to model. Hydrodynamic simulations of clusters may be useful in modeling the shock layer and its effect on the various X-ray observations.

Then, even for an isolated cluster which does not experience any merging, there may exist cooling flows that invalidate the assumption of isothermality (for the validity of hydrostatic equilibrium and isothermality in the presence of cooling flows, see e.g. Waxman & Miralda-Escude 1995). There is hope that these can actually be observed and azimuthally averaged over in order to regain the ability to model the cluster.


Another disturbing caveat may lie in the spherical symmetry we assume. Even though X-ray images of clusters usually appear circular on the sky (and not elliptical), this may be partially attributed to selection effects. Optically defined clusters show much more pronounced two-dimensional elliptical shapes in the galaxy distribution (e.g. Plionis, Barrow, & Frenk 1991). Although the ellipticity of the potential is smaller than that of the mass distribution, some of it must remain. A natural generalization of the current work would be to investigate families of elliptical potentials (Bartelmann & Kolatt 1997). By introducing the axis ratio as one of the free parameters, there is a continuous transition between spherical symmetry and an elongated elliptical cluster when both are assumed to be in hydrostatic equilibrium. Using the same $\chi^2$ statistic as we used here, with this additional free parameter, should result in an estimate for the cluster elongation. In addition, mass models obtained from large arcs in some clusters can help constrain the morphology of the projected cluster mass distribution.

The removal of some of the projection effects by the means presented in this paper will allow a better understanding of the cluster environment and a safer use of clusters as large scale structure probes. These are two big leaps forward.